Standards in Genomic Sciences (2014) 9:755-762 



DOI:10.4056/sigs.5429581 



High quality draft genome sequence of Staphylococcus cohnii 
subsp. cohnii strain hu-01 

Xinjun Hu, Ang Li, LongXian Lv, Chunhui Yuan, Lihua Guo, Xiawei Jiang, Haiyin Jiang,
GuiRong Qian, BeiWen Zheng, Jing Guo, Lanjuan Li*

State Key Laboratory for Diagnosis and Treatment of Infectious Disease, The First Affiliated
Hospital, Zhejiang University, Hangzhou, PR China.

Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases,
Hangzhou, China

* Corresponding author: ljli@zju.edu.cn. 

Keywords: Staphylococcus cohnii subsp. cohnii, genome, HiSeq 2000



Staphylococcus cohnii subsp. cohnii belongs to the family Staphylococcaceae in the order 
Bacillales, class Bacilli and phylum Firmicutes. The increasing relevance of S. cohnii to human 
health prompted us to determine the genomic sequence of Staphylococcus cohnii subsp. 
cohnii strain hu-01, a multidrug-resistant isolate from a hospital in China. Here we describe 
the features of S. cohnii subsp. cohnii strain hu-01, together with the genome sequence and 
its annotation. This is the first genome sequence of the species Staphylococcus cohnii. 



Introduction 



Staphylococcus cohnii belongs to the coagulase-negative staphylococci. It was described by
Schleifer and Kloos (1975) and was named for Ferdinand Cohn, a German botanist and
bacteriologist [1]. Recently, more cases of Staphylococcus cohnii infection have been reported
in the literature. This organism may be responsible for brain abscess, pneumonia, acute
cholecystitis, endocarditis, bacteremia, urinary tract infection and septic arthritis [2].
S. cohnii comprises two subspecies that are defined on the basis of their phenotypic
characteristics: Staphylococcus cohnii subsp. cohnii and Staphylococcus cohnii subsp.
urealyticus [3]. S. cohnii subsp. cohnii is a Gram-positive coccus, coagulase negative and
catalase positive, that behaves like a commensal mucocutaneous bacterium [4]. It has been
isolated more frequently in hospital than in non-hospital environments [2]. Here we report
the draft genome of S. cohnii subsp. cohnii strain hu-01, the first genome of this species to
be sequenced.



Classification and features

Strain hu-01 was isolated from a hospital environment in Zhejiang province, China, in October
2012. It is a Gram-positive, coccus-shaped bacterium that grows on Columbia agar enriched
with 5% sheep blood (bioMérieux, Marcy l'Etoile, France) at 37°C. Growth occurs under either
aerobic or anaerobic conditions. The optimum temperature for growth is 37°C, with a
temperature range of 15-45°C (Table 1). Cell morphology, motility and sporulation were
examined by transmission electron microscopy (H-600, Hitachi). Cells of strain hu-01 are
coccoid, 0.6 to 1.2 μm in diameter, occurring predominantly singly or in pairs (Figure 1 and
Figure 2).




Figure 1. Gram staining of S. cohnii subsp. cohnii strain hu-01




Figure 2. Transmission electron micrograph of cells of strain hu-01. Bar: 0.5 μm

Table 1. Classification and general features of S. cohnii subsp. cohnii strain hu-01 according to the
MIGS recommendations [9].



MIGS ID    Property                 Term                                          Evidence code a)
           Current classification   Domain Bacteria                               TAS [20]
                                    Phylum Firmicutes                             TAS [21-23]
                                    Class Bacilli                                 TAS [24,25]
                                    Order Bacillales                              TAS [26,27]
                                    Family Staphylococcaceae
                                    Genus Staphylococcus                          TAS [26,29-31]
                                    Species Staphylococcus cohnii subsp. cohnii   TAS [1,3]
                                    Strain hu-01                                  IDA
           Gram stain               Positive                                      IDA
           Cell shape               Coccus                                        IDA
           Motility                 Nonmotile                                     IDA
           Sporulation              Nonsporulating                                IDA
           Temperature range        15-45°C                                       IDA
           Optimum temperature      37°C                                          IDA
MIGS-6.3   Salinity                 Tolerates 10% NaCl                            IDA
MIGS-22    Oxygen                   Facultatively anaerobic                       IDA
           Carbon source            D-mannitol, fructose, trehalose               IDA
           Energy source            Fructose, trehalose                           IDA
MIGS-6     Habitat                  Hospital environment                          IDA
MIGS-15    Biotic relationship      Free living                                   IDA
MIGS-14    Pathogenicity            Opportunistic pathogen                        IDA
           Isolation                Hospital                                      IDA
MIGS-4     Geographic location      Hangzhou, China                               IDA
MIGS-5     Sample collection time   October, 2012                                 IDA
MIGS-4.1   Latitude                 30°16'N                                       IDA
MIGS-4.2   Longitude                120°12'E                                      IDA
MIGS-4.3   Depth                    unknown                                       IDA
MIGS-4.4   Altitude                 50 (meters)                                   IDA

a) Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct
report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for
the living, isolated sample, but based on a generally accepted property for the species, or anecdotal
evidence). These evidence codes are from the Gene Ontology project [32]. If the evidence
code is IDA, then the property should have been directly observed, for the purpose of this specific
publication, for a live isolate by one of the authors, or an expert or reputable institution
mentioned in the acknowledgements.

Figure 3. Phylogenetic tree depicting the relationship between S. cohnii subsp. cohnii strain hu-01 and other
members of the genus Staphylococcus. The strains and their corresponding GenBank accession numbers are shown
in parentheses following the organism name. The tree is based on 16S rRNA gene sequences aligned with
CLUSTALW [7]; phylogenetic inferences were made using the neighbor-joining method based on the
Kimura 2-parameter model within the MEGA5 software [8], and the tree was rooted with Bacillus subtilis subsp.
subtilis. Bootstrap consensus trees were inferred from 100 replicates; only bootstrap values > 50% are indicated.



Comparative 16S rRNA gene sequence analysis by BLASTN [5,6] against the NCBI NR/NT database
revealed 94-99% sequence similarity to members of the genus Staphylococcus. Neighbor-joining
phylogenetic analysis based on the Kimura 2-parameter model indicated that strain hu-01 is
most closely related to Staphylococcus cohnii subsp. urealyticus (AB009936.1) (Figure 3).
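
The Kimura 2-parameter model behind the tree can be illustrated with a short sketch. The authors ran the analysis in MEGA5; the function below is not their implementation, only a minimal Python illustration of the standard distance formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name and the decision to skip gapped or ambiguous columns are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Columns containing gaps or ambiguous bases are skipped (an assumption
    of this sketch, comparable to pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

A matrix of such pairwise distances is what a neighbor-joining algorithm takes as input; tree building and bootstrapping are outside the scope of this sketch.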

Biochemical features were tested using two automated systems, the Vitek2 Compact (bioMérieux,
Marcy l'Etoile, France) and the Phoenix 100 ID/AST system (Becton Dickinson Company [BD],
Sparks, Maryland, USA). Positive reactions were obtained for D-fructose, trehalose,
D-gluconic acid and D-mannitol. Negative reactions were observed for glucose, D-trehalose,
D-sucrose, maltose, urea, cellobiose, glucoside, D-tagatose and maltotriose. The strain was
susceptible to gentamicin, ciprofloxacin, levofloxacin, moxifloxacin, quinupristin,
linezolid, vancomycin, tetracycline, tigecycline, nitrofurantoin, rifampicin and
trimethoprim, and resistant to cefoxitin, benzylpenicillin, oxacillin, erythromycin and
clindamycin.

Genome sequencing information

Genome project history

S. cohnii subsp. cohnii strain hu-01 was selected for sequencing because of its increasing
relevance to human health. The strain was isolated from a hospital environment in China.
This whole-genome shotgun project of S. cohnii subsp. cohnii strain hu-01 has been deposited
at DDBJ/EMBL/GenBank under the accession AYOS00000000. Table 2 presents the project
information and its association with MIGS version 2.0 compliance [9].



Table 2. Project information

MIGS ID      Property                   Term
MIGS-31      Finishing quality          High-quality draft
MIGS-28      Libraries used             One paired-end 500 bp library
MIGS-29      Sequencing platforms       Illumina HiSeq 2000
MIGS-31.2    Fold coverage              150× (based on 500 bp library)
MIGS-30      Assemblers                 Velvet 1.2.07
MIGS-32      Gene calling method        Glimmer 3.0
             Genbank ID                 AYOS00000000
             Genbank Date of Release    Jan 06, 2014
             GOLD ID                    Gi0062613
MIGS-13      Project relevance          Biotechnology, Pathway, Pathogenic



Growth conditions and DNA isolation 

S. cohnii subsp. cohnii strain hu-01 was grown aerobically on Columbia blood agar base at
37°C for 24 h. Genomic DNA was extracted using the DNeasy Blood and Tissue kit (Qiagen,
Germany), according to the manufacturer's recommended protocol. The quantity of DNA was
measured with a NanoDrop spectrophotometer and a Qubit fluorometer. Then 10 ng of DNA was
sent to the State Key Laboratory for Diagnosis and Treatment of Infectious Disease at
Zhejiang University for sequencing on a HiSeq 2000 (Illumina, CA) sequencer.

Genome sequencing and assembly 

One DNA library was generated (500 bp insert size, with the Illumina adapter at both ends,
as checked with an Agilent 2100 DNA analyzer), and sequencing was performed on an Illumina
HiSeq 2000 genomic sequencer with a 2×100 bp paired-end strategy. A total of 1,103 Mbp of
sequence data was produced and filtered for quality by the following criteria: 1) reads
linked to adapters at both ends were considered sequencing artifacts and removed; 2) bases
with a quality index lower than Q20 at both ends were trimmed; 3) reads with ambiguous bases
(N) were removed; 4) single qualified reads (reads that passed the filters but whose mate
did not) were discarded. A total of 867.94 Mbp of clean, filtered reads was assembled into
scaffolds using Velvet version 1.2.07 with parameters "-scaffolds no" [10]. We then used a
PAGIT workflow [11] to extend the initial contigs and correct sequencing errors, arriving at
a set of improved scaffolds.
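
Quality criteria 2 to 4 above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' pipeline, which the paper does not name: the function names, the in-memory representation of a read as a (bases, Phred scores) tuple, and the order of the steps (trim first, then test for N) are all assumptions. Criterion 1 (adapter removal) is omitted because it depends on adapter sequences that are not given in the text.

```python
from typing import List, Optional, Tuple

Read = Tuple[str, List[int]]  # (bases, per-base Phred quality scores)


def trim_low_quality_ends(seq: str, quals: List[int], min_q: int = 20) -> Read:
    """Criterion 2: trim bases below min_q from both ends of a read."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


def clean_read(seq: str, quals: List[int], min_q: int = 20) -> Optional[Read]:
    """Return the trimmed read, or None if it fails the filters."""
    seq, quals = trim_low_quality_ends(seq, quals, min_q)
    # Criterion 3: drop reads with ambiguous bases (also drop empty reads)
    if not seq or "N" in seq.upper():
        return None
    return seq, quals


def filter_pairs(pairs: List[Tuple[Read, Read]], min_q: int = 20) -> List[Tuple[Read, Read]]:
    """Criterion 4: keep a pair only if both mates survive cleaning."""
    kept = []
    for (seq1, quals1), (seq2, quals2) in pairs:
        mate1 = clean_read(seq1, quals1, min_q)
        mate2 = clean_read(seq2, quals2, min_q)
        if mate1 is not None and mate2 is not None:
            kept.append((mate1, mate2))
    return kept
```

In practice a dedicated read-trimming tool would do this work on FASTQ files; the sketch only makes the filtering logic explicit.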

Genome annotation 

Protein-coding genes were predicted using Glimmer version 3.0 [12]. tRNAscan-SE version 1.21
[13] was used to find tRNA genes, whereas ribosomal RNAs were found using RNAmmer version
1.2 [14]. To annotate the predicted genes, we used HMMER version 3.0 [15], with parameters
'hmmscan -E 0.01 -domainE 0.01', to align genes against Pfam version 27.0 [16] (only Pfam-A
was used) and find genes with conserved domains. The KAAS server [17] was used to assign
translated amino acid sequences (genetic code table 11) to KEGG Orthology with the SBH
(single-directional best hit) method. Translated genes were aligned with the COG database
using NCBI blastp (hits were required to have a score of no less than 60 and an e-value of
no more than 1e-6). To find genes with hypothetical or putative function, we aligned genes
against the NCBI nucleotide sequence database (the nt database was downloaded on Sep 20,
2013) using NCBI blastn, accepting a hit only if it had an identity of no less than 0.95 and
a coverage of no less than 0.9, and the reference gene was annotated as putative or
hypothetical. To identify genes with signal peptides, we used SignalP version 4.1 [18] with
default parameters except "-t gram+". TMHMM 2.0 [19] was used to identify genes with
transmembrane helices.
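
The hit-acceptance thresholds described above can be written down as simple predicates. The sketch below is an illustration under stated assumptions, not the authors' scripts: the record classes and function names are hypothetical, the "score" is assumed to be the BLAST bit score (the text does not say which score was meant), and identity and coverage are taken as fractions between 0 and 1.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ProteinHit:
    """One blastp hit of a predicted protein against the COG database."""
    query: str
    subject: str
    score: float   # assumed to be the bit score
    evalue: float


@dataclass(frozen=True)
class NucleotideHit:
    """One blastn hit of a predicted gene against the nt database."""
    query: str
    subject: str
    identity: float   # fraction of identical positions, 0..1
    coverage: float   # fraction of the query covered, 0..1
    description: str  # annotation of the reference gene


def keep_cog_hit(hit: ProteinHit, min_score: float = 60.0, max_evalue: float = 1e-6) -> bool:
    """Score no less than 60 and e-value no more than 1e-6."""
    return hit.score >= min_score and hit.evalue <= max_evalue


def keep_nt_hit(hit: NucleotideHit, min_identity: float = 0.95, min_coverage: float = 0.9) -> bool:
    """Identity >= 0.95, coverage >= 0.9, reference annotated putative/hypothetical."""
    text = hit.description.lower()
    annotated = "putative" in text or "hypothetical" in text
    return hit.identity >= min_identity and hit.coverage >= min_coverage and annotated


def best_cog_hit_per_gene(hits: Iterable[ProteinHit]) -> Dict[str, ProteinHit]:
    """Keep only the highest-scoring qualified hit for each query gene."""
    best: Dict[str, ProteinHit] = {}
    for hit in hits:
        if not keep_cog_hit(hit):
            continue
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return best
```

Keeping a single best hit per gene mirrors the rule, stated with Table 4, that only the best match to a single COG category is counted.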






Genome properties 

The draft genome sequence of S. cohnii subsp. cohnii strain hu-01 revealed a genome size of
5,761,489 bp and a G+C content of 34.85% (521 scaffolds with an N50 of 39,926 bp). These
scaffolds contain 5,820 coding sequences (CDSs), 61 tRNAs (excluding 6 pseudo tRNAs) and
incomplete rRNA operons (10 small subunit rRNAs and 3 large subunit rRNAs). A total of 1,840
protein-coding genes were assigned a putative function or annotated as hypothetical
proteins. A total of 3,734 genes were categorized into COG functional groups. The properties
and statistics of the genome are summarized in Table 3 and Table 4.
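
For readers unfamiliar with the assembly metrics quoted here, the sketch below shows how N50 and fold coverage are conventionally computed. It is an illustration only; the function names are hypothetical and the scaffold lengths of the actual assembly are not reproduced. The coverage example uses the 867.94 Mbp of filtered sequence and the 5,761,489 bp genome size reported in this paper, which gives roughly the 150× listed in Table 2.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


def fold_coverage(total_bases: float, genome_size: int) -> float:
    """Average sequencing depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size


# Values taken from the text: 867.94 Mbp of filtered data, 5,761,489 bp genome
depth = fold_coverage(867.94e6, 5_761_489)
print(f"approximate fold coverage: {depth:.1f}x")
```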



Table 3. Genome statistics of S. cohnii subsp. cohnii strain hu-01

Attribute                           Value        % of total a)
Genome size (bp)                    5,761,489
DNA coding region (bp)              4,751,472    82.469
DNA G+C content (bp)                1,697,984    29.471
Total genes                         5,833
RNA genes                           13           0.221
Protein-coding genes                5,820        99.777
Genes with function prediction      1,840        31.544
Genes assigned to COGs              3,734        64.015
Genes assigned to Pfam domains      4,943        84.741
Genes with signal peptides          431          7.388
Genes with transmembrane helices    1,629        27.927

a) The total is based on either the size of the genome in base pairs or the total number of genes in the annotated genome.


Table 4. Number of genes associated with the general COG functional categories

Code    Value a)    % age b)    Description
J       230         3.95        Translation, ribosomal structure and biogenesis
K       452         7.77        Transcription
L       184         3.16        Replication, recombination and repair
B       3           0.05        Chromatin structure and dynamics
D       72          1.24        Cell cycle control, cell division, chromosome partitioning
V       187         3.21        Defense mechanisms
T       238         4.09        Signal transduction mechanisms
M       254         4.36        Cell wall/membrane/envelope biogenesis
N       70          1.20        Cell motility
Z       1           0.02        Cytoskeleton
W       1           0.02        Extracellular structures
U       57          0.98        Intracellular trafficking, secretion, and vesicular transport
O       147         2.53        Posttranslational modification, protein turnover, chaperones
C       292         5.02        Energy production and conversion
G       384         6.60        Carbohydrate transport and metabolism
E       640         11.00       Amino acid transport and metabolism
F       140         2.41        Nucleotide transport and metabolism
H       234         4.02        Coenzyme transport and metabolism
I       165         2.84        Lipid transport and metabolism
P       389         6.68        Inorganic ion transport and metabolism
Q       197         3.38        Secondary metabolites biosynthesis, transport and catabolism
R       841         14.45       General function prediction only
S       403         6.92        Function unknown
- c)    483         8.30        Not archived in COGs
- d)    1603        27.54       No hits

a) For some genes, qualified alignments can occur with several genes belonging to different COG categories. In such
cases only the best match to a single COG category is considered. b) The total is based on the total number of
protein-coding genes (5,820) in the annotated genome. c) These genes have alignments with reference genes archived
in COG, but these reference genes do not have COG categories. d) Genes without a qualified hit to a reference gene.
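
The percentage column of Table 4 follows directly from footnote b: each count is divided by the 5,820 protein-coding genes. A minimal sketch of that arithmetic, using a few rows from the table (the function and variable names are hypothetical):

```python
TOTAL_PROTEIN_CODING_GENES = 5820  # from Table 3 and footnote b of Table 4

# A few COG category counts copied from Table 4
cog_counts = {"J": 230, "K": 452, "R": 841, "no hits": 1603}


def percent_of_total(count: int, total: int = TOTAL_PROTEIN_CODING_GENES) -> float:
    """Share of protein-coding genes, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


for code, count in cog_counts.items():
    print(f"{code}: {count} genes, {percent_of_total(count)}%")
```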

Conclusion

Staphylococcus cohnii subsp. cohnii is part of the normal flora of human skin and mucous
membranes and, under particular conditions, may become an opportunistic pathogen [4]. The
genome sequence of Staphylococcus cohnii subsp. cohnii strain hu-01 will provide a basis for
elucidating the molecular principles of host colonization and offer insight into the genetic
background of this organism's pathogenesis.